A Coordinate-Descent Algorithm for Tracking Solutions in Time-Varying Optimal Power Flows
Consider a polynomial optimisation problem whose instances vary continuously
over time. We propose to use a coordinate-descent algorithm for solving such
time-varying optimisation problems. In particular, we focus on relaxations of
transmission-constrained problems in power systems.
On the example of the alternating-current optimal power flow (ACOPF) problem,
we bound from above the difference between the current approximate optimal
cost generated by our algorithm and the optimal cost of a relaxation using the
most recent data, in terms of properties of the instance and the rate at which
the instance changes over time. We also bound the number of floating-point
operations that need to be performed between two updates in order to
guarantee that the error is bounded from above by a given constant.
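As a minimal illustration of the tracking idea (a generic sketch in Python,
not the paper's algorithm; the quadratic model, step budget, and all names
here are our assumptions), one can warm-start coordinate descent on each new
instance and spend a fixed budget of coordinate steps between data updates:

    import numpy as np

    def coordinate_step(A, b, x, i):
        # Exactly minimize the quadratic 0.5*x'Ax - b'x over coordinate i.
        x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]

    def track(instances, steps_per_update, rng):
        # Warm-started coordinate descent across a stream of (A, b) instances.
        _, b0 = instances[0]
        x = np.zeros_like(b0)
        for A, b in instances:               # new data arrives; keep old iterate
            for _ in range(steps_per_update):
                i = rng.integers(len(x))     # pick a random coordinate
                coordinate_step(A, b, x, i)
            yield x.copy()                   # current approximate minimizer

    rng = np.random.default_rng(0)
    n = 5
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)              # fixed positive-definite part
    stream = [(A, np.sin(0.1 * t) * np.ones(n)) for t in range(20)]
    for x in track(stream, steps_per_update=10, rng=rng):
        pass                                 # x tracks the drifting minimizer

The larger the step budget between updates, the smaller the tracking error;
the bound described above makes this trade-off quantitative.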
CoCoA: A General Framework for Communication-Efficient Distributed Optimization
The scale of modern datasets necessitates the development of efficient
distributed optimization methods for machine learning. We present a
general-purpose framework for distributed computing environments, CoCoA, that
has an efficient communication scheme and is applicable to a wide variety of
problems in machine learning and signal processing. We extend the framework to
cover general non-strongly-convex regularizers, including L1-regularized
problems like lasso, sparse logistic regression, and elastic net
regularization, and show how earlier work can be derived as a special case. We
provide convergence guarantees for the class of convex regularized loss
minimization objectives, leveraging a novel approach in handling
non-strongly-convex regularizers and non-smooth loss functions. The resulting
framework has markedly improved performance over state-of-the-art methods, as
we illustrate with an extensive set of experiments on real distributed
datasets.
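For orientation, the covered class takes the standard regularized loss
minimization form (a generic template in our notation, not necessarily the
paper's exact formulation):

    \min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell_i(w^\top x_i) \;+\; \lambda\, g(w),

where choosing g(w) = \|w\|_1 gives the lasso or sparse logistic regression
(depending on the loss \ell_i), and g(w) = \alpha \|w\|_1 +
\tfrac{1-\alpha}{2} \|w\|_2^2 gives elastic net regularization.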
Sensitivity analysis of the early exercise boundary for American style of Asian options
In this paper we analyze American-style floating strike Asian call options
belonging to the class of financial derivatives whose payoff diagram depends
not only on the underlying asset price but also on the path average of
underlying asset prices over some predetermined time interval. The mathematical
model for the option price leads to a free boundary problem for a parabolic
partial differential equation. Applying fixed domain transformation and
transformation of variables we develop an efficient numerical algorithm based
on a solution to a non-local parabolic partial differential equation for the
transformed variable representing the synthesized portfolio. For various types
of averaging methods we investigate the dependence of the early exercise
boundary on model parameters.
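For reference, a floating strike Asian call whose averaging runs over [0, T]
(continuous arithmetic averaging, in our notation) has terminal payoff

    \max\!\left( S_T - \frac{1}{T} \int_0^T S_t \, \mathrm{d}t, \; 0 \right),

so the exercise value depends on the whole price path through its average; it
is this extra state variable that the fixed domain transformation and change
of variables reduce to a single non-local parabolic equation.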
Randomized coordinate descent methods for big data optimization
This thesis consists of 5 chapters. We develop new serial (Chapter 2), parallel (Chapter
3), distributed (Chapter 4) and primal-dual (Chapter 5) stochastic (randomized) coordinate
descent methods, analyze their complexity, and conduct numerical experiments on huge-scale
synthetic and real data (GBs/TBs of data, millions/billions of variables).
In Chapter 2 we develop a randomized coordinate descent method for minimizing the sum
of a smooth and a simple nonsmooth separable convex function and prove that it obtains
an ε-accurate solution with probability at least 1 - p in at most O((n/ε) log(1/p)) iterations,
where n is the number of blocks. This extends recent results of Nesterov [43], which cover
the smooth case, to composite minimization, while at the same time improving the complexity
by a factor of 4 and removing ε from the logarithmic term. More importantly, in contrast
with the aforementioned work in which the author achieves the results by applying the method
to a regularized version of the objective function with an unknown scaling factor, we show
that this is not necessary, thus obtaining the first true iteration complexity bounds. For strongly
convex functions the method converges linearly. In the smooth case we also allow for arbitrary
probability vectors and non-Euclidean norms. Our analysis is also much simpler.
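A minimal Python sketch of the serial method's main loop on one standard
instance of the composite problem, 0.5*||Ax - b||^2 + λ*||x||_1 (single
coordinates rather than blocks, with the textbook step sizes; an illustration,
not the thesis's exact method):

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t*|.|.
        return np.sign(v) * max(abs(v) - t, 0.0)

    def rcd_lasso(A, b, lam, n_iters, rng):
        # Randomized coordinate descent for 0.5*||Ax - b||^2 + lam*||x||_1.
        n = A.shape[1]
        L = (A ** 2).sum(axis=0)             # coordinate-wise Lipschitz constants
        x = np.zeros(n)
        r = A @ x - b                        # residual, maintained incrementally
        for _ in range(n_iters):
            i = rng.integers(n)              # coordinate sampled uniformly
            g = A[:, i] @ r                  # partial derivative of smooth part
            x_new = soft_threshold(x[i] - g / L[i], lam / L[i])
            r += A[:, i] * (x_new - x[i])
            x[i] = x_new
        return x

The number of such iterations needed for an ε-accurate solution with
probability at least 1 - p is what the O((n/ε) log(1/p)) bound above controls.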
In Chapter 3 we show that the randomized coordinate descent method developed in Chapter
2 can be accelerated by parallelization. The speedup over the serial method, measured in
terms of the number of iterations needed to approximately solve the problem with high
probability, is equal to the product of the number of processors and a natural, easily
computable measure of separability of the smooth component of the objective function. In the
worst case, when no degree of separability is present, there is no speedup; in the best case, when
the problem is separable, the speedup is equal to the number of processors. Our analysis also
works in the mode when the number of coordinates being updated at each iteration is random,
which allows for modeling situations with a variable (busy or unreliable) number of processors.
We demonstrate numerically that the algorithm is able to solve huge-scale l1-regularized least
squares problems with a billion variables.
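For orientation, one common way such a speedup factor is expressed in the
parallel coordinate descent literature (an illustration in our notation; the
thesis's exact constants may differ) is, for n coordinates, τ processors, and
degree of partial separability ω of the smooth part,

    \text{speedup} \;\approx\; \frac{\tau}{1 + \frac{(\tau - 1)(\omega - 1)}{n - 1}},

which equals τ when ω = 1 (a separable problem) and degrades to 1 as ω
approaches n, matching the best and worst cases described above.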
In Chapter 4 we extend coordinate descent to a distributed environment. We initially
partition the coordinates (features or examples, based on the problem formulation) and assign
each partition to a different node of a cluster. At every iteration, each node picks a random
subset of the coordinates from those it owns, independently from the other computers, and in
parallel computes and applies updates to the selected coordinates based on a simple closed-form
formula. We give bounds on the number of iterations sufficient to approximately solve the
problem with high probability, and show how it depends on the data and on the partitioning.
We perform numerical experiments with a LASSO instance described by a 3TB matrix.
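The scheme can be mimicked in a few lines of single-process Python (a
schematic simulation only; a real implementation applies the per-node updates
concurrently across a cluster, and the least-squares objective and all names
here are our assumptions):

    import numpy as np

    def distributed_cd(A, b, n_nodes, tau, n_rounds, rng):
        # Simulated distributed coordinate descent for 0.5*||Ax - b||^2.
        # Coordinates are partitioned across nodes; each round, every node
        # independently samples tau of its own coordinates and applies a
        # closed-form (exact line search) update to each.
        n = A.shape[1]
        L = (A ** 2).sum(axis=0)             # assumes no zero columns in A
        parts = np.array_split(rng.permutation(n), n_nodes)  # one part per node
        x = np.zeros(n)
        r = A @ x - b
        for _ in range(n_rounds):
            chosen = np.concatenate(
                [rng.choice(p, size=min(tau, len(p)), replace=False) for p in parts]
            )
            for i in chosen:   # applied sequentially here; in parallel on a cluster
                step = -(A[:, i] @ r) / L[i]
                x[i] += step
                r += A[:, i] * step
        return x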
Finally, in Chapter 5, we address the issue of using mini-batches in stochastic optimization
of Support Vector Machines (SVMs). We show that the same quantity, the spectral norm of
the data, controls the parallelization speedup obtained for both primal stochastic subgradient
descent (SGD) and stochastic dual coordinate ascent (SDCA) methods, and use it to derive novel
variants of mini-batched (parallel) SDCA. Our guarantees for both methods are expressed in
terms of the original nonsmooth primal problem based on the hinge-loss.
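On the primal side, a minimal sketch of mini-batch stochastic subgradient
descent for the hinge-loss SVM (the Pegasos-style 1/(λt) step size and the
batch size are illustrative assumptions, not the thesis's exact variants):

    import numpy as np

    def minibatch_svm_sgd(X, y, lam, batch_size, n_iters, rng):
        # min_w  lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i * <x_i, w>)
        n, d = X.shape
        w = np.zeros(d)
        for t in range(1, n_iters + 1):
            idx = rng.choice(n, size=batch_size, replace=False)
            margins = y[idx] * (X[idx] @ w)
            active = margins < 1.0            # margin violators in the batch
            grad = lam * w
            if active.any():
                grad -= (y[idx][active, None] * X[idx][active]).sum(axis=0) / batch_size
            w -= grad / (lam * t)             # Pegasos-style step size
        return w

Larger batches parallelize the subgradient computation; the point above is
that the spectral norm of the data governs how large the mini-batch can be
before this stops paying off.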
Our results in Chapters 2 and 3 are cast for blocks (groups of coordinates) instead of
coordinates, and hence the methods are better described as block coordinate descent methods.
While the results in Chapters 4 and 5 are not formulated for blocks, they can be extended to
this setting.
Reinforcement Learning for Solving Stochastic Vehicle Routing Problem
This study addresses a gap in the utilization of Reinforcement Learning (RL)
and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing
Problem (SVRP), which involves the challenging task of optimizing vehicle
routes under uncertain conditions. We propose a novel end-to-end framework that
comprehensively addresses the key sources of stochasticity in SVRP and utilizes
an RL agent with a simple yet effective architecture and a tailored training
method. Through comparative analysis, our proposed model demonstrates superior
performance compared to a widely adopted state-of-the-art metaheuristic,
achieving a significant 3.43% reduction in travel costs. Furthermore, the model
exhibits robustness across diverse SVRP settings, highlighting its adaptability
and ability to learn optimal routing strategies in varying environments. The
publicly available implementation of our framework serves as a valuable
resource for future research endeavors aimed at advancing RL-based solutions
for SVRP.
Comment: 14 pages, accepted to ACML2